Data Analytics Unit

Quarterly Orientation

June, 2023

“If we have data, let’s look at the data. If all we have are opinions, let’s go with mine.”
Jim Barksdale

The DAU

  • Small but growing team
  • More of a horizontal coordination role
  • Data focused

What do we do?

  • Data collection through non-traditional methods
  • Data wrangling and visualization
  • Workflow developments
  • Research

We are not involved in the calculation of index scores

Data collection

  • Web scraping is the process of collecting information from the web and exporting it as an organized data structure that fits our needs.
  • Automated process run by scraper bots.
  • Data collected:
    • EU Lawyers Data: Lawyer information from national and regional Bar Associations across 27 EU countries.
    • Political News Data: Headlines, descriptions, and body text of around 100,000 news articles from the political sections of 12 major newspapers in the EU.
  • Geocoding is the process of transforming a description of a location into geographical coordinates such as longitude and latitude.
  • Done through Python/R APIs for online mapping services such as Google Maps or OpenStreetMap.
  • Implemented in the EU Lawyers Data.
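
The two steps above (scraping into a structured table, then geocoding a location field) can be sketched roughly as follows. This is a minimal illustration, not the unit's actual scraper: the HTML fragment, the class names "lawyer", "name" and "city", and the FAKE_GEOCODER lookup table are all hypothetical stand-ins. A real pipeline would fetch live pages and call a mapping service instead of a fixed dictionary.

```python
from html.parser import HTMLParser

# Hypothetical HTML fragment standing in for a bar association directory page.
SAMPLE_PAGE = """
<ul>
  <li class="lawyer"><span class="name">Ana Example</span><span class="city">Lisbon</span></li>
  <li class="lawyer"><span class="name">Jan Sample</span><span class="city">Warsaw</span></li>
</ul>
"""


class LawyerParser(HTMLParser):
    """Collects one dict per <li class="lawyer"> element."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._field = None  # name of the span currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "lawyer":
            self.records.append({})
        elif tag == "span" and css_class in ("name", "city") and self.records:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a span we care about.
        if self._field:
            self.records[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


# Assumption: a fixed lookup table with approximate coordinates replaces
# the call to a real geocoding API.
FAKE_GEOCODER = {"Lisbon": (38.72, -9.14), "Warsaw": (52.23, 21.01)}


def geocode(place, lookup=FAKE_GEOCODER):
    """Return (latitude, longitude) for a place name, or None if unknown."""
    return lookup.get(place)


def scrape_and_geocode(html):
    """Parse the page into records and attach coordinates to each one."""
    parser = LawyerParser()
    parser.feed(html)
    for record in parser.records:
        coords = geocode(record.get("city", ""))
        record["lat"], record["lon"] = coords if coords else (None, None)
    return parser.records


records = scrape_and_geocode(SAMPLE_PAGE)
```

Each entry in records is a flat dictionary (name, city, lat, lon), which is the kind of organized structure that can be exported to a table for analysis.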

Data wrangling and visualization

  • Data wrangling is the process of transforming unstructured or “dirty” data into a tidy and structured version ready for analysis.
  • Process is outcome specific
  • Multidisciplinary process
  • Visualization should be data driven and outcome oriented.
  • Outputs produced:
    • Charts
    • Infographics
    • Interactive dashboards
    • Dynamic visualizations
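
As a rough illustration of the wrangling step described above, the sketch below turns a few "dirty" rows into tidy records. The column names, the sample values, and the set of missing-value markers are invented for the example. The cleaning rules a real project needs depend on the outcome it serves.

```python
# Hypothetical raw rows: stray whitespace, mixed casing, a decimal comma,
# and a textual missing-value marker.
RAW_ROWS = [
    {"Country ": " france", "Score": "0.71", "Year": "2022"},
    {"Country ": "GERMANY ", "Score": "n/a", "Year": "2022"},
    {"Country ": "Spain", "Score": "0,65", "Year": "2023 "},
]

# Assumption: these strings all mean "no value recorded".
MISSING = {"", "n/a", "na", "-"}


def clean_number(text):
    """Convert a numeric string to float, or return None if it is missing."""
    text = text.strip().lower()
    if text in MISSING:
        return None
    return float(text.replace(",", "."))


def tidy(rows):
    """Normalize column names and value types, one output dict per row."""
    tidy_rows = []
    for row in rows:
        row = {key.strip().lower(): value for key, value in row.items()}
        tidy_rows.append({
            "country": row["country"].strip().title(),
            "year": int(row["year"].strip()),
            "score": clean_number(row["score"]),
        })
    return tidy_rows


TIDY_ROWS = tidy(RAW_ROWS)
```

After this step every row has the same keys and types, so the same charting or modeling code can run on it without special cases.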

Workflow developments

“Anything that can be automated, should be automated. Do as little as possible by hand. Do as much as possible with functions.”
Hadley Wickham

  • A unified workflow that lets different team members work on the same project (and even the same code) harmoniously, with little need for direct coordination between them.

  • Examples

    • Chart automation
    • Mass production of reports
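
One way to picture the mass production of reports is a single function applied to every country, so that no report is assembled by hand. The sketch below is only an illustration under assumed inputs: the template text, the DATA dictionary and the function names are hypothetical and do not describe the team's actual tooling.

```python
from string import Template

# Hypothetical Markdown template shared by every report.
REPORT_TEMPLATE = Template(
    "# $country report\n\n"
    "Average score: $average\n"
    "Observations: $count\n"
)

# Hypothetical input: a list of scores per country.
DATA = {"France": [0.70, 0.74], "Spain": [0.60, 0.66, 0.63]}


def build_report(country, scores):
    """Fill the shared template with the figures for one country."""
    average = sum(scores) / len(scores)
    return REPORT_TEMPLATE.substitute(
        country=country, average=f"{average:.2f}", count=len(scores)
    )


def build_all_reports(data):
    """Apply the same function to every country in the data set."""
    return {country: build_report(country, scores) for country, scores in data.items()}


reports = build_all_reports(DATA)
```

Because the layout lives in one template and the logic in one function, a change to either is picked up by every report on the next run.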

How do we do it?

  • Data cleaning
  • Data exploration
  • Intensive data cleaning
  • Data Visualization
  • Workflow developments
  • Geospatial manipulation
  • Statistical Models
  • Web scraping
  • Machine Learning Models
  • Natural Language Processing
  • Tools development
  • HTML & CSS
    • Online reports
    • Aesthetic manipulation
  • Markdown & Quarto
    • Documentation
    • Presentations
    • Internal reports

Resources

Thank you for your attention